Setting individualised goals for people living with dementia and their family carers: A systematic review of goal-setting outcome measures and their psychometric properties

Background Individualised goal-setting outcome measures can be a useful way of reflecting people living with dementia and family carers’ differing priorities regarding quality-of-life domains in the highly heterogeneous symptomatology of the disease. Evaluating goal-setting measures is challenging, and there is limited evidence for their psychometric properties. Aim (1) To describe what goal-setting outcomes have been used in this population; (2) To evaluate their validity, reliability, and feasibility in RCTs. Method We systematically reviewed studies that utilised goal-setting outcome measures for people living dementia or their family carers. We adapted a risk of bias and quality rating system based on the COSMIN guidelines to evaluate the measurement properties of outcomes when used within RCTs. Results Thirty studies meeting inclusion criteria used four different goal-setting outcome measures: Goal Attainment Scaling (GAS), Bangor Goal Setting Interview (BGSI), Canadian Occupational Performance Measure (COPM) and Individually Prioritized Problems Assessment (IPPA); other papers have reported study-specific goal-setting attainment systems. Only GAS has been used as an outcome over periods greater than 9 months (up to a year). Within RCTs there was moderate quality evidence for sufficient content validity and construct validity for GAS, COPM and the BGSI. Reliability was only assessed in one RCT (using BGSI); in which two raters reviewed interview transcripts to rate goals with excellent inter-rater reliability. Feasibility was reported as good across the measures with a low level of missing data. Conclusion We found moderate quality evidence for good content and construct validity and feasibility of GAS, BGSI and COPM. While more evidence of reliability of these measures is needed, we recommend that future trials consider using individualised goal setting measures, to report the effect of interventions on outcomes that are most meaningful to people living with dementia and their families.


Introduction
Dementia is characterised by highly heterogenous symptoms including cognitive impairments and other neuropsychiatric symptoms which impair daily functioning (WHO, 2022).Quality of life is consistently cited by older adults as more important than disease specific outcomes (Tochel et al., 2019) and is included as an outcome in many dementia trials.Because dementia symptoms and domains of quality of life are varied and of differing relevance to people living with dementia and their relatives, there is a focus on patient-reported relevant outcomes measures (PROMs) (Cooper et al., 2012).
The main alternative to standardised scaled outcome measures is to use highly individualised goal setting or goal attainment scaling systems.In this paper we define goal-setting outcome measures as those using a system to set individualised goals (brief statements about a behaviour that the user would like to carry out or achieve) with people living with dementia and/or their family carers, against which attainment can be rated.Most goal-setting measures aim to capture individualised and clinically meaningful outcomes (Shabbir & Sanders, 2014) making them particularly suited for assessing interventions for diseases with heterogenous symptoms and stages such as dementia.
A 2008 systematic review examining the utility of Goal Attainment Scaling (GAS) for people living with dementia reported mixed findings regarding responsiveness, reliability, validity and feasibility (Bouwens et al., 2008).It identified a small number of studies that used GAS, and 9/10 reviewed studies were conducted by the same research group (Bouwens et al., 2008).They concluded that the evidence was not yet strong enough to state that GAS was a suitable for this population but affirmed its potential value of being uniquely able to reflect the multidimensionality of dementia.Dementia trials are now developing a wide range of interventions (including drugs, rehabilitation, psychosocial, environmental, preventative, or different approaches to delivering care) and all are striving to be bolder in their vision for person-centred approaches to dementia care (Kim & Park, 2017).To our knowledge, there has been no more recent, nor broader review of all goalsetting measures for people living with dementia or their family carers.
Goal setting outcome measures differ from conventional PROMs as they lack fixed items.The construct being measured is commonly described as 'the change' or extent to which the goal is achieved because of an intervention on an aspect of the user's life.The number of goals, goal content and attainment levels vary between studies and participants, so validity and reliability are complex to measure (Gaasterland et al., 2019).In this review, we follow the COnsensusbased Standards for the selection of health Measurement Instruments (COSMIN) guidelines (Mokkink et al., 2020), a comprehensive checklist designed to assess the psychometric properties and methodological quality of outcome measures.Many of the COSMIN criteria cannot be evaluated for goal-setting outcome measures, including criterion validity and internal consistency.A review of GAS within drug trials used an adapted version of COSMIN and found that the included trials reported on inter-rater reliability, content validity, construct validity and responsiveness of goal setting measures (Gaasterland et al., 2016).Gaasterland et al. (2019) provide guidelines of how to evaluate content validity, construct validity and inter-rater, intrarater and inter-trial reliability of goal-setting outcomes, which we have used in this review (Table 1).
In this paper we aim to evaluate evidence regarding the utility of goal-setting outcome measures in people living with dementia and their family carers.Our aims are to (a) describe what goal-setting measures have been used with people living with dementia and their family carers; and (b) evaluate the validity (content and construct), reliability (inter-rater reliability and responsiveness) and feasibility of measures that have been used in RCTs.

Methods
We registered our protocol on the Prospective Register of Systematic Reviews (PROSPERO -CRD42021245401) and used PRISMA guidelines (Moher et al., 2015) to conduct and report this review.

Search strategy
In February 2022 we searched CINAHL, Embase, PsychInfo and Medline for studies that used individualised goal focused outcome measures for people with dementia and/or their family carers.The databases were examined using a combination of keywords within three blocks: (1) Dementia, (2) Goals, and (3) Outcome Measures, with synonyms and relevant MeSH headings tailored to each database (full search details in Appendix 1).

Inclusion and exclusion criteria
Titles and abstracts were screened by the first author (J.B.) and 10% were independently reviewed by second reviewer (C.C.) to identify articles where one or more individualised goals were set for people living with dementia and/or their family carers and used as outcome measures for any type of intervention.We included studies where ≥75% of the sample had a diagnosis of dementia.(Mokkink et al., 2010) and goal outcome-adapted COSMIN definitions of measurement properties, and their quality criteria (Gaasterland et al., 2016(Gaasterland et al., , 2019 One researcher (J.B.) then reviewed the full texts to select the final eligible articles, discussing uncertainties with the wider research team.Studies were included where; • Goals were set and rated by either a family carer, person living with dementia, clinician, researcher, or a combination of these.• At least one psychometric property (validity or reliability) was assessed or the feasibility or interpretability of the goal-setting outcomes was reported.
We excluded case studies, dissertation abstracts, protocols, and reviews.

Data extraction
We used Covidence, a web-based collaboration software platform that streamlines the production of systematic reviews, for data management.Duplicate articles were removed.See Table 2 for details of data extracted.

Assessing risk of bias and quality ratings of evidence
We evaluated methodological quality, validity, reliability and feasibility for goal outcome measures within RCTs.To do this, we adapted the COSMIN risk of bias checklist (Mokkink et al., 2018) (Appendix 2), and 'Good Measurement Properties' COSMIN criteria (Mokkink et al., 2010) (Table 1).We adapted Box 2 (Content validity), Box 6 (Reliability) and Box 9 and 10 (Construct validity) from the COSMIN risk of bias checklist using the standards for content and construct validity developed from Gaasterland et al. (2019Gaasterland et al. ( , 2016))'s papers and drew on the definitions and the findings in two previous GAS reviews (Bouwens et al., 2008;Shankar et al., 2020).We distinguished between individual level and trial level construct validity (see Table 1 for definitions).
Appendix 2 shows the full adapted boxes, standards, and rating guide.Each study was independently rated by two of three researchers (J.B, S.Z and A.O).For each study, we evaluated (1) risk of bias of the evaluation as 'very good', 'adequate', 'doubtful', or 'inadequate'; and (2) the quality rating as sufficient, insufficient, indeterminate, and inconsistent.Any conflicting ratings were discussed as a team to reach a consensus rating.The overall study quality was recorded as the lowest rating of any standard within the box following 'the worst score counts' principle of COSMIN (Mokkink et al., 2018).We extracted information related to the interpretability and feasibility of the goal setting outcomes following the COSMIN recommendations (Mokkink et al., 2020).Interpretability refers to the degree qualitative meaning can be assigned to the single scores or change in scores of the measure and included looking at completion rates and the percentage of missing data.The feasibility of the measures includes any details related to the ease of application of the measure including completion time, cost of measure use, training needed and ease of administration (Mokkink et al., 2010).
Finally, we used a modified GRADE approach (Prinsen et al., 2018;Terwee et al., 2018) to give overall ratings for the quality of the evidence (high, moderate, low, very low evidence) for content validity, construct validity and reliability of each goal setting measure.

Search strategy results
As outlined in Figure 1, we identified 33 articles that met the inclusion criteria, which described 30 studies.

Goal attainment scaling
Goal attainment scaling was used to test psychosocial interventions (n = 10), rehabilitation (n = 5) and drug studies (n = 4) (see Table 2).In most studies, clinicians facilitated the goal setting and scored attainment (n = 14), experienced dementia care staff facilitated GAS in five studies and one study used non-clinically trained facilitators (Rapaport et al., 2021).In ten studies, family carers were the primary person involved in GAS but three of these studies included the people living with dementia wherever possible.Five studies explicitly involved both the people living with dementia and family carers.The people living with dementia was the primary person setting goals in two studies; in one of them the family carer was involved if available.Two geriatricians and a nurse collaborated to set goals on behalf of care home residents in one study (Gordon et al., 1999).
Goal attainment scaling formulates individualised scoring scales when setting the goal, usually defining what the baseline level of behaviour would look like if it were to get 'much worse', 'worse',  'better' or 'much better than expected'.All but one study used a 5-point scale to assess goal attainment (usually À2 to +2, although two studies used ratings of À1 to +3 (Wilz et al., 2011;Wilz et al., 2018), while the remaining study used a 3-point scale; 0 (no change) to 2 (completely achieved) (Harris et al., 2020).There was also variation in values ascribed to the scale numbers.The original GAS methodology (Kiresuk & Sherman, 1968) allocates the baseline level at 'À1' or 'À2' with '0' being 'goal achieved'.In eight studies 'zero' was defined as 'goal achieved' or 'expected outcome'.In seven studies, zero was ascribed to the baseline status or current level of functioning, which allows for more levels of deterioration which may be more suitable for degenerative diseases like dementia where decline is more likely (Rockwood et al., 2002).Two studies did not specify the scaling used (Petyaeva et al., 2018;Watchman et al., 2021).
The GAS follow-up periods ranged from 1 week to 12 months.Five studies asked participants to rank goals in order of importance or priority and used rankings to weight scores.One study (Jennings et al., 2018) asked people living with dementia and family carers to rate how difficult they thought their goals would be to achieve on a four-point scale (not at all difficult to extremely difficult).10/19 studies transformed GAS ratings into GAS T-scores using a standardised formula (Kiresuk & Sherman, 1968).Other studies used narrative methods to report on number and type of goals achieved, and attainment levels.

Other goal-setting measures
The COPM was used in two RCTs (Clare et al., 2010;Regan et al., 2017) that test cognitive rehabilitation interventions.COPM provides a semi structured interview format to help users identify goals within selected areas.COPM uses a 10-point scale to measure the level of performance and satisfaction with goal attainment (1; unable to perform/not satisfied, to 10; fully able to perform/extremely satisfied).The 'change score' is calculated by summing the individual goal ratings of performance and satisfaction and then dividing by the number of goals set.In both studies non-clinically trained but supervised research assistants facilitated the goal setting with people living with dementia only.The follow up periods for COPM are shorter than the other measures at 4 weeks and 8 weeks.
The BGSI is used in a series of three studies (Watermeyer et al., 2016) testing cognitive rehabilitation programmes by the research group who developed BGSI (Clare et al., 2015).Like the COPM, it is a semi structured interview and uses the same 10-point scale.It has since been used outside of this research group in a small sample of three Irish patients with Alzheimer's Disease (Kelly et al., 2019).The BGSI is facilitated by researchers in the included studies.Clare et al. (2019) trained research assistants but the training researchers received in the other two studies is unclear.Unlike GAS, both the BGSI and the COPM use standardised, rather than individually tailored scaling systems across studies.The follow-up periods varied from 11 weeks post baseline to 9 months.
Table 2 outlines 5 goal-setting measures that have not been employed in more than one study; five developed their own measure and one non-randomised study (Bemelmans et al., 2016) used the IPPA which is a measure specially developed to assessed the effectiveness of assistive technology (Wessels et al., 2002).

Findings from studies employing goal-orientated measures in randomised control trials
Out of the eleven RCTs, six utilised GAS and five used COPM (n = 2), BGSI (n = 2) or a selfdeveloped method (n = 1).Three of the RCTs using GAS (Boots et al., 2017;Chester et al., 2021;Wilz et al., 2011) were not assessed for quality or psychometric properties since they only used GAS in their intervention group or as part of the intervention.
Quality appraisal of randomised control trials.Table 3 outlines both the Risk of Bias and quality ratings for the goal setting measures in each RCT.Table 4 summarises the overall quality of the measurement properties within the RCTs following the modified GRADE approach.
Content validity.All eight RCTs evaluated content validity and across the studies GAS, COPM and BGSI was rated as sufficient, with a moderate level of available evidence due to the small number of RCTs in total (Table 4).Table 3 shows that sufficient evidence for content validity was reported for two studies using GAS (Leroi et al., 2014;Rockwood et al., 2006) and two studies using the BGSI Table 3. Risk of Bias ratings and measurement property quality ratings within RCTs (Cells highlighted in green indicate the psychometric property was given a good or adequate risk of bias rating and are of sufficient quality).Abbreviations: + = sufficient; À = insufficient; ?= indeterminate; ± = inconsistent.(Clare et al., 2019;Hindle et al., 2018).These studies described the methodology in detail and synthesised the mixed methods.The goal setting and scoring were reviewed by independent experts and both the people living with dementia and family carers were involved in the setting and scoring of goals.There was also a focus on ensuring that goals were based on SMART criteria.The two RCTs using COPM (Clare et al., 2010;Regan et al., 2017) had adequate and sufficient evidence for content validity.Two RCTs indicated doubtful content validity and were of inconsistent quality (Rockwood et al., 1997).The level of content analysis performed on set goals varied between the RCTs, ranging from detailed analysis that outlined the goal domains, areas and examples (Clare et al., 2019;Hindle et al., 2018;Rockwood et al., 2006), to brief summaries of overall goal domains (Clare et al., 2010;Leroi et al., 2014;Regan et al., 2017;Rockwood et al., 1996), to no analysis of the content being performed (Tappen, 1994).
Construct validity.Construct validity was assessed at individual and trial levels.Only one study was assessed on an individual level but was rated to have insufficient quality.Rockwood et al. (1996) found that GAS scores were found to correlate moderately with ADAS-Cog (a measure of cognitive ability, r = .52)and GDS (Global Deterioration Scale, r = .63)but not with MMSE (Mini mental state examination, r = .004)(Rockwood et al., 1996).Thus, for GAS, COPM and BGSI, there was insufficient evidence to draw a conclusion on an individual level.On a trial level, there was overall sufficient evidence of moderate-quality for the construct validity of COPM, GAS, BGSI and an unspecified measure, see Table 4.It is expected that the mean GAS scores of two randomised groups receiving effective or non-effective interventions will differ in favour of the group receiving the effective intervention.All but one study found this to be the case.Rockwood et al. (1996) found no significant difference between groups (p = .54),but the study was exploratory with a small sample size and GAS was still concluded to be the most responsive measure with the largest effect size (.61) and relative efficacy (.47).Four studies (Hindle et al., 2018;Leroi et al., 2014;Rockwood et al., 2006;Tappen, 1994) found evidence of trial level construct validity but had small sample sizes.Three remaining studies (Clare et al., 2010(Clare et al., , 2019;;Regan et al., 2017) using the COPM or BGSI showed good methodological quality and strong evidence of good construct validity.
Reliability.The only type of reliability that was assessed was inter-rater reliability in one study.A pilot RCT testing a cognitive rehabilitation intervention (Hindle et al., 2018;Watermeyer et al., 2016) used an independent researcher to code a subsample of the qualitative data set from the BGSI goal setting discussions and create their own goal rating.They found excellent inter-rater reliability (Krippendorff's alpha = .95)and agreement was 95.7%.Feasibility and interpretability.One of the earliest studies did not report on feasibility (Tappen, 1994) but all proceeding RCTs reported that GAS, COPM and BGSI were feasible goal-setting measures for both facilitators and users, with all included participants able to set at least one goal.Across the RCTs, the mean number of goals was approximately three per user.Researchers using GAS met with the participants in clinical settings and those using COPM and BGSI met participants in their own homes (all in person).Only one RCT mentioned how long the goal setting process took (2 hours over 3 visits) and suggests this extra time allowed more efficient follow up interviews (Rockwood et al., 1996).No studies mention any cost associated with these outcome measures.There was a very low rate of missing data for the goal measures suggesting good interpretability.The highest percentage of missing data was 9.4% (or 12 participants in placebo group) who did not complete GAS scores at 6 and 8 months (Rockwood et al., 2006).

Discussion
We identified four main goal setting measures being used as outcomes in this population: GAS, BGSI, COPM and IPPA.GAS, BGSI and COPM were used in RCTs, and using an adapted methodology based on COSMIN, we found moderate quality evidence for sufficient feasibility and validity, but reliability needs to be further assessed.
A central part of all three measures is the identification of goals.The BGSI and COPM focus on an initial interview in which facilitators help users identify and set goals.The COPM is based on the Canadian Model of Occupational Performance and Engagement (COPM-E) in which the client centred approach is central (McColl et al., 2005).COPM was developed for occupational therapy clinics, while the BGSI has been developed primarily as a research tool based on the concept of motivational interviewing and the social cognitive theory of behaviour change (Clare et al., 2012).The GAS studies used different approaches to identify goals; one used the COPM method (Ciro et al., 2014), others used similar interview techniques to the COPM and BGSI (Boots et al., 2016(Boots et al., , 2017;;Rockwood et al., 2006), and others used a goal inventory for clients to select goals from predefined goal areas (Jennings et al., 2018;Leroi et al., 2014;Rapaport et al., 2021).GAS differs from the other goal setting measures due to the formulation of individualised scoring scales when setting the goal.People living with dementia and family carers define in their own words what the baseline level behaviour or situation would look like it was to improve or get worse to form the 5point scale (much worse to much better).This increases the complexity of goal setting but has the benefit that a highly personalised outcome measure is produced.
We found evidence of very good or adequate content and construct validity for GAS, BGSI and COPM in RCTs.An important consideration in evaluating content validity is assessing whether the target population was involved in setting the goals and whether the goals were reviewed by one or more independent experts (Gaasterland et al., 2016(Gaasterland et al., , 2019)).The fact that people living with dementia are encouraged to share their preferences is important for the clinical relevance of these measures.Where people living with dementia lack capacity, studies often asked the family carer to help set the goals.Although this allows for family carer bias, the goals are still highly relevant, as they are usually the person most able to understand the people living with dementia's preferences if they no longer have capacity to express these (de Vugt et al., 2003).We agree with the conclusion of Bouwens et al. (2008) that family carers should not only help set the goals for the people living with dementia but should also set goals relevant to themselves.
All identified goal setting measures in this study have been facilitated by either clinically trained individuals or trained and supervised research assistants.It is vital that the goals selected are relevant for the intervention, and that this should be evaluated by an expert in the intervention content (Gaasterland et al., 2019).Although it is agreed across studies that training in facilitating the measures is important, there is a lack of detail in what the training entailed.Rockwood et al. (2006) outlined that they provided 4 hours of training for health professionals and Rapaport et al. (2021) reported that the study team received 2 days of training by GAS experts.The GREAT RCT detailed an initial two-day training course, annual refresher training days and monthly supervision for optimising the goal setting process (Clare et al., 2019).Future studies should outline what training was provided, the background experience of the facilitators and levels of supervision or goal review process (if any) provided to ensure studies set suitable goals which are central to the validity of the measure.
Another important aspect of content validity was to assess whether the goals set were SMART and if any content analysis was carried out on the set goals.Determining whether putative goals are realistic may be especially challenging for people living with dementia, and access to resources must be carefully considered.Ensuring goals are SMART is explicitly written into the guidance for COPM and BGSI.The GAS methodology has evolved to include the importance of setting well defined SMART goals at baseline (Rockwood et al., 1996;Tappen, 1994).The level of content analysis performed on set goals varied between studies, but it is recommended that goal content analysis is completed where goal domains, areas and descriptors are clearly outlined (Gaasterland et al., 2019).
The construct definition and the score meaning is crucial to determine if change is effectively measured on an individual level.It is therefore easier to assess construct validity on a trial level where two groups can be compared and where it is expected that the change scores of the two randomised groups will differ in favour of the one receiving the effective intervention (Gaasterland et al., 2019).All included RCTs formulated specific hypotheses and provided an adequate description of the intervention for construct validity to be assessed.
Reporting on reliability of the goal setting measures was limited in the RCTs but more common in some of the non-RCT studies.Although reliability contributes to the accuracy of findings, it cannot be a substitute for validity (Zumbo et al., 2014).Only one RCT study demonstrated excellent inter-rater reliability of the BGSI (Hindle et al., 2018;Watermeyer et al., 2016) by having a second independent researcher complete the goal content analysis of baseline goals.One aspect of evaluating reliability is determining whether the time intervals were appropriate or not.Goal outcome measures have been used up to twelve months follow up in previous nonrandomised studies (Boots et al., 2017;Jennings et al., 2018;Judge et al., 2011;Rockwood et al., 2002) for PWLD but it is not known what would be deemed an inappropriate time interval for these measures for people living with dementia.GAS has been used within RCTs with this population up to 6 months (Chester et al., 2021;Ciro et al., 2014) and BGSI has been used up to 9 months, but the primary outcome was still set at 3 months (Clare et al., 2019).Future work could explore using these goal measures over different time periods to determine the optimal time for follow ups which is likely to be dependent on dementia severity.This work would help further establish the reliability and feasibility of these outcomes and determine when goals are no longer relevant or of insignificant importance to the users.
The identified outcome measures have been used for interventions aimed at people living with dementia and/or family carers across a wide variety of dementia diagnosis types, mainly of mildmoderate severity although some studies have included people of all severities (often in care home settings; (Petyaeva et al., 2018)).In studies that set goals with people living with dementia and family carers it is sometimes unclear what the level of people living with dementia engagement was.When adapting these goal setting systems, it should be considered if there are ways to maximise the contribution of people living with dementia.A study in Japan used the Aid for Decision-making in Occupation Choice (ADOC) measure (Tomori et al., 2012) to ask people living with dementia to select 20 activities from 95 illustrations of daily activities which was then reduced to the 5 most important.The use of illustrations or other adapted methods in the goal selection process is unexplored.All the goals set and rated with participants in this review have been done in person, face to face within clinics, care homes or the participants' own homes.At the time of writing, the COVID-19 pandemic resulted in a shift in how people interact and so it would be timely to explore the psychometric properties of goal setting measures completed via remote methods too.
Goal setting outcomes enables research to use personalised outcomes to assess the efficacy of interventions, but they can also serve to tailor interventions and be part of the intervention.Goal setting is a crucial aspect in rehabilitation settings for example (Turner-Stokes et al., 2018).Rapaport et al. (2021) used GAS as both a primary outcome measure but also to directly inform the psychosocial intervention (NIDUS-Family) for PWLD and their family carers.There are several benefits to setting goals, including helping family carers and people living with dementia communicate priorities and needs (Jennings et al., 2018), keeping motivation high leading to better performance and prolonged effort as well as improving people's sense of self-efficacy (Locke & Latham, 2002).Further work in validating and establishing the best methodology of goal setting outcome measures may also have benefits to evaluate person-centred care within clinical and social care settings.Future reviews may also consider including studies that recruited paid/professional home care workers since we know how crucial a dedicated and adequate supported home-care workforce is to quality care (Carter, 2016).
Although we include studies from four continents, participants were all from high-income countries and were generally highly educated which limits the generalisability of the findings.Future studies should look to implement goal outcome measures with people living with dementia and family carers with lower socioeconomic status and educational attainments.One of the major advantages of goal setting measures is that they can account for differing cultural norms and languages in a way other standardised questionnaires cannot.
While we followed methods developed in previous studies, there is still a lack of agreement or a standardised way to assess the psychometric properties of goal outcome measures.There are limitations to trying to adapt the COSMIN guidance to apply to these unique types of outcome measures.We were only able to evaluate limited psychometric properties in a limited number of RCTs.With each measurement only being used by 1-3 RCTs we are unable to assess the influence of any of the psychometric properties.Further RCTs within dementia research utilising goal setting outcome measures will be crucial for further evaluation.

Conclusion
This study shows there is adequate evidence of content and construct validity and feasibility for GAS, BGSI, COPM being used as goal-setting measures for people living with dementia and family carers in RCTs.There is good evidence of inter-rater reliability for BGSI in one RCT, but reliability is not tested in other RCTs.We are not able to conclude on the use of one measure over another but suggest that GAS, BGSI and COPM have different strengths.The BGSI and COPM provide good guidance on an effective approach to goal identification interviews while GAS provides a detailed and personalised scaling system that is designed to be particularly sensitive to change.The flexibility and adaptability of goal setting measures can be beneficial for dementia researchers, as shown by the studies which developed their own goal attainment systems.
A key feature of goal setting outcome measures is that it enables people living with dementia and family carers to select goals within variable life domains that can reflect the high multidimensionality of dementia and that can be selected to fit the intervention, project or setting.There is no recommended guide of how to use GAS with people living with dementia so training and practice in how to set goals with this population and their carers is important.Further development and recommendations for facilitator training could be a beneficial way to ensure individualised personcentred outcomes are more widely used in RCTs while also allowing further evaluation of the psychometric properties of these measures.

Figure 2 .
Figure 2. Pie chart showing who goals were set with via % of the included studies (n = 30).

Figure 3 .
Figure 3. Pie chart showing study setting via % of the included studies (n = 30).

Figure 4 .
Figure 4. Pie chart type of interventions tested via % of the included studies (n = 30).
Sedigheh Zabihi, MSc, is a research assistant at University College London, Division of Psychiatry.Her research is focused on developing interventions to support people living with dementia at home and investigating how lifestyle and behavioural change can prevent dementia in older people.Anna Olsen, MSc, is a research assistant at University College London, Division of Psychiatry.Her research is focused on developing interventions to support people living with dementia at home and investigating how lifestyle and behavioural change can prevent dementia in older people.Professor Claudia Cooper, PhD MRCPsych, is a Professor of Psychological Medicine and Centre Lead at the Centre for Psychiatry and Mental Health at Queen Mary University of London and is a Consultant Old Age Psychiatrist in East London NHS Foundation Trust memory services.Claudia leads the Alzheimer's Society Centre of Excellence for Independence at home, in which they are developing interventions to support people living with dementia at home and investigating how lifestyle and behavioural change can prevent dementia in older people.Claudia is a member of the UK Cabinet Office Trial Advice Panel, which advises and supports evaluations of national government programmes and policies.

Table 2 .
Outline of the study populations, Interventions, goal setting outcome measure methods, follow up periods and reported findings and measurement properties in the included studies.

Table 2 .
(continued) a GAS score calculated using the standard formula[11].
Table 2 outlines the study characteristics and outcome measure characteristics.Figures 2-4 describe included studies in terms of who set the goals with facilitators, the study setting, and the types of interventions being tested.

Table 4 .
Overall rating and quality of evidence of goal outcome measures in RCTs